Automatic de-identification of protected health information
Abstract
This paper presents an automatic de-identification system for Serbian, based on a rapid adaptation of an existing named entity recognition system. Built on a finite-state methodology and lexical resources, the system is designed to detect and replace all explicit personal protected health information in medical narrative texts while preserving all relevant medical concepts. The results of a preliminary evaluation demonstrate the usefulness of this method both for preserving patient privacy and for keeping the de-identified documents interoperable.
Avtomatska dezidentifikacija zaščitenih zdravstvenih podatkov
V prispevku predstavimo sistem za avtomatsko dezidentifikacijo v srbščini, ki temelji na hitri prilagoditvi obstoječega sistema za identifikacijo imenskih entitet. Sistem je zasnovan na metodologiji končnih avtomatov in jezikovnih virov ter identificira in zamenja vse eksplicitne zaščitene zdravstvene osebne podatke v medicinskih narativnih besedilih, pri čemer pa ohrani relevantne medicinske koncepte. Rezultati preliminarne evalvacije so pokazali uporabnost te metode, in sicer tako pri zaščiti osebnih podatkov pacientov kot pri interoperabilnosti dezidentificiranih dokumentov.
Related resources
An Automatic System to Detect and Extract Text in Medical Images for De-identification
Recently, there has been an increasing need to share medical images for research purposes. In order to respect and preserve patient privacy, most of the medical images are de-identified with protected health information (PHI) before research sharing. Since manual de-identification is time-consuming and tedious, an automatic de-identification system is necessary and helpful for the doctors to remove...
Automatic detection of protected health information from clinic narratives
This paper presents a natural language processing (NLP) system that was designed to participate in the 2014 i2b2 de-identification challenge. The challenge task aims to identify and classify seven main Protected Health Information (PHI) categories and 25 associated sub-categories. A hybrid model was proposed which combines machine learning techniques with keyword-based and rule-based approaches...
Modes of De-identification
De-identification of protected health information is an essential method for protecting patient privacy. Most institutes require de-identification of patient data prior to conducting scientific studies; therefore, it is important for clinical scientists to be cognizant of all modes of de-identification and all services provided by their de-identification tools. In this article, we discuss eight...
Preparing a collection of radiology examinations for distribution and retrieval
Objective: Clinical documents made available for secondary use play an increasingly important role in discovery of clinical knowledge, development of research methods, and education. An important step in facilitating secondary use of clinical document collections is easy access to descriptions and samples that represent the content of the collections. This paper presents an approach to developin...
Learning to Recognize Protected Health Information in Electronic Health Records with Recurrent Neural Network
De-identification in electronic health records is a prerequisite to distribute medical records for further clinical data processing or mining. In this paper, we introduce a framework based on recurrent neural network to solve the de-identification problem, and compare state-of-the-art methods with our framework. It is integrated, which includes records skeleton generation, chunk representation ...
Publication date: 2014